Comparing and Extracting Paraphrasing Words with 2-Way Bilingual Dictionaries

Authors

  • Kazutaka Takao
  • Kenji Imamura
  • Hideki Kashioka
Abstract

We analyze the variety of lexical expressions using 2-way bilingual dictionaries and propose a method for extracting paraphrasing words. First, we compare the coverage of an English-Japanese (E-J) dictionary and a Japanese-English (J-E) dictionary from the viewpoint of the returnability of words when translating from English to Japanese and then back to English. The variety is illustrated with examples. Next, we propose a method for automatically extracting English paraphrasing word groups: we gather the English index words that have the same Japanese translation words in the E-J dictionary. English words that are difficult for native speakers of Japanese to distinguish are then extracted into a paraphrasing group. We also extract Japanese paraphrasing word groups for comparison. This method should be useful for sentence matching, especially for accepting a variety of expressions.
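The two steps described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy dictionaries `EJ` and `JE`, the function names, and in particular the grouping criterion (index words whose sets of Japanese translations are identical), which is only one possible reading of "have the same Japanese translation words". It is not the authors' implementation.

```python
from collections import defaultdict

# Toy dictionaries, invented for illustration only (not data from the paper).
EJ = {
    "big": {"大きい"},
    "large": {"大きい"},
    "fast": {"速い"},
    "quick": {"速い"},
    "house": {"家"},
}
JE = {
    "大きい": {"big", "large"},
    "速い": {"fast"},
    "家": {"house", "home"},
}


def round_trip(word, ej, je):
    """Return the English words reached by translating E->J and then J->E."""
    back = set()
    for ja in ej.get(word, set()):
        back |= je.get(ja, set())
    return back


def is_returnable(word, ej, je):
    """A word is 'returnable' here if the round trip leads back to itself."""
    return word in round_trip(word, ej, je)


def paraphrase_groups(ej):
    """Group English index words whose Japanese translation sets are identical.

    Assumption: identical translation sets stand in for 'the same Japanese
    translation words'; a looser criterion (any shared translation) is
    equally conceivable.
    """
    by_translation = defaultdict(set)
    for en, ja_words in ej.items():
        by_translation[frozenset(ja_words)].add(en)
    # Keep only groups with at least two members, in a stable order.
    return sorted(sorted(g) for g in by_translation.values() if len(g) > 1)


if __name__ == "__main__":
    print(round_trip("big", EJ, JE))
    print(is_returnable("quick", EJ, JE))
    print(paraphrase_groups(EJ))
```

In this toy data, "quick" is not returnable because the J-E entry for 速い lists only "fast", which is the kind of coverage gap between the two dictionaries that the comparison is meant to expose.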

Similar resources

Automatic Construction of a Japanese-Chinese Dictionary via English

This paper proposes a method of constructing a dictionary for a pair of languages from bilingual dictionaries between each of the languages and a third language. Such a method would be useful for language pairs for which wide-coverage bilingual dictionaries are not available, but it suffers from spurious translations caused by the ambiguity of intermediary third-language words. To eliminate spu...

Extracting Bilingual Persian Italian Lexicon from Comparable Corpora Using Different Types of Seed Dictionaries

Ebrahim Ansari et al. 2017. Extracting bilingual Persian Italian lexicon from comparable corpora using different types of seed dictionaries. In "Applications of Comparable Corpora", edited book, Berlin Linguistic Press (ed.). Bilingual dictionaries are very important in various fields of natural language processing. In recent years, research on extracting new bilingual lex...

Model in Word

Extracting bilingual dictionaries from corpora can be seen as a very fine-grained alignment process, where the aligned units are not paragraphs or sentences but words and phrases. Most approaches to this problem rely on statistical means to build translation lexicons from bilingual texts, roughly falling into two categories: the hypotheses testing approach and the estimating approach. There are...

Bilingual Text Matching using Bilingual Dictionary and Statistics

This paper describes a unified framework for bilingual text matching by combining existing hand-written bilingual dictionaries and statistical techniques. The process of bilingual text matching consists of two major steps: sentence alignment and structural matching of bilingual sentences. Statistical techniques are applied to estimate word correspondences not included in bilingual dictionarie...

Utilizing Contextually Relevant Terms in Bilingual Lexicon Extraction

This paper demonstrates one efficient technique in extracting bilingual word pairs from non-parallel but comparable corpora. Instead of using the common approach of taking high frequency words to build up the initial bilingual lexicon, we show contextually relevant terms that co-occur with cognate pairs can be efficiently utilized to build a bilingual dictionary. The result shows that our model...

Publication date: 2002